Efficient conjugate gradient based channel estimator

ABSTRACT

In forming a current channel estimate from a received signal y, the received signal y is decoded to form data s, a convolution matrix Ŝ is formed from a first portion of the data s, a matrix F₁ is formed from a second portion of the data s such that the second portion of the data s includes data that is less recent than the data in the first portion of the data s, a matrix F₂ is formed from a third portion of the data s such that the third portion of the data s includes data that is more recent than the data in the first portion of the data s, a predicted channel estimate h_(pred) is determined, and a conjugate gradient algorithm is performed to determine the current channel estimate. The conjugate gradient algorithm is based on the received signal y, the matrix Ŝ, the matrices F₁ and F₂, the predicted channel estimate h_(pred), and a previous channel estimate h₁.

RELATED APPLICATIONS

This application is a continuation-in-part of U.S. application Ser. No. 10/729,722 filed on Dec. 5, 2003.

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a conjugate gradient based channel estimator.

BACKGROUND OF THE INVENTION

Since the adoption of the ATSC digital television (DTV) standard in 1996 in the United States, there has been an ongoing effort to improve the design of receivers built for the ATSC DTV signal. The primary obstacle that designers face in achieving good reception is the presence of multipath interference in the channel.

The broadcast television channel is a relatively severe multipath environment due to a variety of conditions that are encountered in the channel and at the receiver. Channels are characterized by impulse responses which may be several hundred symbols long, so that strong interfering signals may arrive at the receiver both before and after the largest amplitude signal. In addition, the signal transmitted through the channel is subject to time varying channel conditions due to the movement of the transmitter and signal reflectors, airplane flutter, and, for indoor reception, people walking around the room. If mobile reception is desired, movement of the receiver must also be considered.

Moreover, the ATSC DTV signal uses trellis coded 8-level vestigial sideband (usually referred to as 8T-VSB or, more simply, as 8-VSB) as the modulation method. 8-VSB data symbols are real and have a signal pulse shape that is complex. Only the real part of the complex pulse shape is a Nyquist pulse. Therefore, even if there is no multipath, the imaginary part of the complex pulse shape contributes intersymbol interference (ISI) when the channel gain seen by the equalizer is not real.

Multipath, and intersymbol interference in general, adversely affect the ability of the receiver to correctly receive the symbols transmitted by the transmitter. Therefore, designers add equalizers to receivers in order to cancel the effects of multipath and intersymbol interference and thereby improve signal reception.

Because the channel is not known a priori at the receiver, the equalizer must be able to modify its response to match the channel conditions that it encounters and to adapt to changes in those channel conditions. To aid in the convergence of an adaptive equalizer to channel conditions, the field sync segment of the frame as defined in the ATSC standard may be used as a training sequence for the equalizer. But when equalization is done in the time domain, long equalizers (those having many taps) are required due to the long channel impulse responses that characterize the channel.

The original Grand Alliance receiver used an adaptive decision feedback equalizer (DFE) with 256 taps. This adaptive decision feedback equalizer was adapted to the channel using a standard least mean square (LMS) algorithm, and was trained with the field sync segment of the transmitted frame. The LMS algorithm converges quite slowly and, even with only 256 taps, does not always converge during a single training sequence. Because the field sync segment is transmitted relatively infrequently (about every 260,000 symbols), the total convergence time of this equalizer is quite long if the equalizer only adapts on training symbols prior to convergence.

Therefore, in order to adapt equalizers to follow channel variations that occur between training sequences, the addition of blind and decision directed methods to equalizers has been suggested. However, when implemented in a realistic system, these methods may require several data fields to achieve convergence, and convergence may not be achieved at all under difficult multipath conditions.

In any event, because multipath signals in the broadcast channel may arrive many symbols after the main signal, the decision feedback equalizer is invariably used in 8-VSB applications.

It has been argued that a blind decision feedback equalizer is required due to the rise in mean square error (MSE) between training sequences. However, adaptation of the trained equalizer in the simulation that supported this argument was frozen between training sequences. It is possible that a decision-directed equalizer with good tracking performance may be able to follow the channel variations tested.

Blind algorithms based on the Sato algorithm and on Godard's constant modulus algorithm (CMA) have been proposed. The error term in both of these algorithms uses a continuous blending of a decision-directed term with the blind term. This blending enables a smooth transition between the blind mode and the decision-directed mode. However, when implemented in a realistic system, these algorithms also may take several data fields to converge.

As mentioned previously, adaptive equalizers utilizing the least mean square (LMS) algorithm may converge slowly or not at all depending on the channel conditions. Convergence may be adversely affected if the input data auto-correlation matrix has a large eigenvalue spread. Also, if the decision feedback equalizer has not converged before the end of the training sequence, the shape of the objective function may change so that it includes local minima. These local minima may be caused by closed eye channel conditions and decision feedback equalizer error propagation.

The recursive least square (RLS) algorithm is well known to avoid these convergence problems. However, the recursive least square algorithm, in its basic form, requires the computationally intensive inversion of the input data auto-correlation matrix.

Lattice based forms of the recursive least square algorithm avoid the need for this matrix inversion. However, the lattice based forms of the recursive least square algorithm are not easily amenable to the advantage of initialization from an initial channel impulse response (CIR) estimate, even though such an initialization may be desirable.

Recent work in reduced rank filtering has connected the multi-stage nested Wiener filter (MSNWF) of Goldstein and Reed to the conjugate gradient algorithm (CG). It has been shown that the multi-stage nested Wiener filter solves the Wiener-Hopf equations in the Krylov subspace associated with the auto-correlation matrix and the cross-correlation vector. The multi-stage nested Wiener filter is then re-formulated using the Lanczos iteration. It has also been shown that the Lanczos-based multi-stage nested Wiener filter is equivalent to the conjugate gradient algorithm. Since the multi-stage nested Wiener filter often needs few dimensions to approach the performance of the full-rank Wiener filter, the conjugate gradient algorithm is a good candidate for an adaptive equalization algorithm with fast convergence.

A description of the conjugate gradient optimization algorithm and many of its mathematical properties may be found in “An Introduction to Optimization,” by E. K. P. Chong and S. Zak, New York, N.Y., John Wiley & Sons, 1996. The algorithm described is applicable to the optimization of a fixed objective function. An excellent review of adaptive filtering with the conjugate gradient algorithm is found in “Analysis of conjugate gradient algorithms for adaptive filtering,” by P. S. Chang and A. N. Willson, Jr., vol. 48, pp. 409-418, IEEE Transactions on Signal Processing, February 2000. The mathematical richness of the algorithm relationships leads to a variety of options in the implementation of the basic conjugate gradient algorithm for minimizing a quadratic objective function. These options can be carried over to the adaptive situation to provide a number of options for implementation of the algorithm. One characteristic of the algorithm which may also be exploited is that it works directly with the correlation matrices and vectors. This characteristic makes initialization based on an initial channel impulse response straightforward, assuming information is available for this purpose.

Like lattice based forms of the recursive least square algorithm, the conjugate gradient algorithm does not require the inversion of the input data auto-correlation matrix. Another characteristic of the conjugate gradient algorithm may be exploited by recognizing that the input data auto-correlation matrix is the product of two Toeplitz matrices.

The invention described in U.S. patent application Ser. No. (7227) uses these characteristics to substantially reduce the computation load in the conjugate gradient algorithm and to even eliminate the need to form the input auto-correlation matrix. Accordingly, the improved conjugate gradient algorithm disclosed in that application may be used to achieve satisfactory convergence times for equalizers.

The (7227) application also suggests that the conjugate gradient algorithm can be used to estimate channels. The present invention reduces the computation complexity in performing the conjugate gradient algorithm with respect to channel estimators.

SUMMARY OF THE INVENTION

In accordance with one aspect of the present invention, a method of processing a received signal y to produce a current channel estimate from one or more past channel estimates that include a previous channel estimate h₁ comprises the following: (a) decoding the received signal y to form data s; (b) forming a convolution matrix Ŝ from a first portion of the data s; (c) forming a matrix F₁ from a second portion of the data s, wherein the second portion of the data s includes data that is less recent than the data in the first portion of the data s; (d) forming a matrix F₂ from a third portion of the data s, wherein the third portion of the data s includes data that is more recent than the data in the first portion of the data s; (e) determining a predicted channel estimate h_(pred) based on the one or more past channel estimates; and, (f) performing a conjugate gradient algorithm to determine the current channel estimate, wherein the conjugate gradient algorithm is based on the received signal y, the matrix Ŝ, the matrices F₁ and F₂, the predicted channel estimate h_(pred), and the previous channel estimate h₁.

In accordance with another aspect of the present invention, a method of processing a received signal y to produce a current channel estimate from one or more past channel estimates that include a previous channel estimate h₁ comprises the following: (a) decoding the received signal y to form data s; (b) forming a convolution matrix Ŝ from a first portion of the data s; (c) forming a matrix F₁ from a second portion of the data s, wherein the second portion of the data s includes data that is less recent than the data in the first portion of the data s; (d) forming a matrix F₂ from a third portion of the data s, wherein the third portion of the data s includes data that is more recent than the data in the first portion of the data s; (e) determining a predicted channel estimate h_(pred) based on the one or more past channel estimates; and, (f) performing a conjugate gradient algorithm to determine the current channel estimate, wherein the conjugate gradient algorithm is based on the received signal y, the matrix Ŝ, the matrices F₁ and F₂, the predicted channel estimate h_(pred), and the previous channel estimate h₁, wherein the conjugate gradient algorithm includes (i) forming FFTs based on the received signal y, the matrix Ŝ, and the matrices F₁ and F₂, (ii) multiplying the FFTs to form a multiplication product, and (iii) forming an inverse FFT of the multiplication product.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features and advantages will become more apparent from a detailed consideration of the invention when taken in conjunction with the drawings in which:

FIG. 1 illustrates a filter that can be used in a decision feedback equalizer according to an embodiment of the present invention;

FIG. 2 illustrates the decision feedback equalizer according to an embodiment of the present invention;

FIG. 3 illustrates a matrix S which is derived from a vector of symbol decisions of length 2496 and is useful in describing the present invention;

FIG. 4 illustrates the decomposition of the matrix S to form two matrices Ŝ and F useful in implementing the present invention;

FIG. 5 shows a system for determining a channel estimate h based on the matrices Ŝ and F; and,

FIG. 6 illustrates the decomposition of the matrix F to form two matrices F₁ and F₂.

DETAILED DESCRIPTION

As shown in FIG. 1, a filter 10 can be used as an equalizer and includes storage registers 12[n], . . . , 12[n+N−3], 12[n+N−2], 12[n+N−1]. For example, N may be 512. The storage registers 12[n], . . . , 12[n+N−3], 12[n+N−2], 12[n+N−1] store corresponding received data y[n], . . . , y[n+N−3], y[n+N−2], y[n+N−1]. The output of each of the storage registers 12[n], . . . , 12[n+N−3], 12[n+N−2], 12[n+N−1] is coupled to a corresponding multiplier 14[n], . . . , 14[n+N−3], 14[n+N−2], 14[n+N−1]. Accordingly, the multipliers 14[n], . . . , 14[n+N−3], 14[n+N−2], 14[n+N−1] receive the stored data from the storage registers 12[n], . . . , 12[n+N−3], 12[n+N−2], 12[n+N−1]. The multipliers 14[n], . . . , 14[n+N−3], 14[n+N−2], 14[n+N−1] also receive corresponding tap weights g[n], . . . , g[n+N−3], g[n+N−2], g[n+N−1]. The multipliers 14[n], . . . , 14[n+N−3], 14[n+N−2], 14[n+N−1] multiply each of the stored data y[n], . . . , y[n+N−3], y[n+N−2], y[n+N−1] by a corresponding one of the tap weights g[n], . . . , g[n+N−3], g[n+N−2], g[n+N−1], and the multiplication results are summed by a summer 16 to provide the output of the filter 10. Various algorithms as described above have been implemented in order to set the values of the tap weights g[n], . . . , g[n+N−3], g[n+N−2], g[n+N−1].

A decision feedback equalizer 20 is shown in FIG. 2 and includes a feed forward filter 22, a feedback filter 24, a summer 26, a hard decision device 28, and a controller 30 that computes the tap weights for the feed forward filter 22 and the feedback filter 24. Each of the feed forward filter 22 and the feedback filter 24 may comprise the filter 10 of FIG. 1. The hard decision device 28, for example, may be a decoder or a slicer.

The output ŝ[n] of the summer 26 is the equalized output of the decision feedback equalizer 20. The hard decision device 28 (such as a slicer) determines the closest value s̃[n] for each of the symbols provided as the output ŝ[n] of the summer 26. These values s̃[n] are fed back as inputs to the feedback filter 24. The feedback filter 24 applies tap weights ĝ_(B)[n] to the values s̃[n] and provides its output to the summer 26. Likewise, the feed forward filter 22 applies tap weights ĝ_(F)[n] to the values y[n] of the received signal and provides its output to the summer 26. The summer 26 sums the output of the feedback filter 24 with the output of the feed forward filter 22 in order to produce the output ŝ[n].

The output ŝ[n] of the decision feedback equalizer 20 is given by the following equation: $\begin{aligned} \hat{s}[n] &= \mathrm{Re}\left\{ g_{F}^{H}\, y[n] \right\} + g_{B}^{T}\, \tilde{s}[n] \\ &= g_{FR}^{T}\, y_{R}[n] + g_{FI}^{T}\, y_{I}[n] + g_{B}^{T}\, \tilde{s}[n] \end{aligned} \quad (1)$ where s̃[n] as indicated above is defined to be the symbol estimates provided by the hard decision device 28, where g_(FR)=Re{g_(F)} is the real part of the tap weights g_(F) of the feed forward filter 22, where g_(FI)=Im{g_(F)} is the imaginary part of the tap weights g_(F) of the feed forward filter 22, and where g_(B) are the tap weights of the feedback filter 24.
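The signal flow of FIG. 2 and equation (1) can be sketched in Python with NumPy roughly as follows. This is only an illustration: the names `LEVELS`, `slicer`, and `dfe_output`, and the use of the nominal 8-VSB levels ±1, ±3, ±5, ±7 for the hard decision, are assumptions and are not taken from this disclosure.

```python
import numpy as np

# Nominal 8-VSB symbol levels (an illustrative assumption).
LEVELS = np.array([-7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

def slicer(s_hat):
    """Hard decision: return the level closest to the equalized sample."""
    return LEVELS[np.argmin(np.abs(LEVELS - s_hat))]

def dfe_output(g_f, g_b, y_window, past_decisions):
    """One equalizer output per equation (1), plus its hard decision."""
    ff = np.real(np.vdot(g_f, y_window))    # Re{g_F^H y[n]} (vdot conjugates g_f)
    fb = np.dot(g_b, past_decisions)        # g_B^T s~[n]
    s_hat = ff + fb
    return s_hat, slicer(s_hat)

# Tiny usage example with made-up tap weights and data.
s_hat, decision = dfe_output(
    g_f=np.array([1.0 + 0.0j, 0.0 + 0.0j]),
    g_b=np.array([0.1]),
    y_window=np.array([2.6 + 1.0j, 0.5 + 0.0j]),
    past_decisions=np.array([3.0]),
)
```

In a running equalizer the returned decision would be shifted into the feedback filter before the next sample is processed.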

The definitions given by the following equations may be assumed: $\tilde{y}_{F}[n] = \left[ y_{R}^{T}[n] \;\; y_{I}^{T}[n] \right]^{T} \quad (2)$ $\tilde{y}[n] = \left[ \tilde{y}_{F}^{T}[n] \;\; \tilde{s}^{T}[n] \right]^{T} \quad (3)$ $\tilde{g}_{F}[n] = \left[ g_{FR}^{T}[n] \;\; g_{FI}^{T}[n] \right]^{T} \quad (4)$ $\tilde{g}[n] = \left[ \tilde{g}_{F}^{T}[n] \;\; g_{B}^{T}[n] \right]^{T} \quad (5)$ where y_(R)=Re{y_(F)} is the real part of the received symbols y_(F), and where y_(I)=Im{y_(F)} is the imaginary part of the received symbols y_(F). Then, the equalized symbols ŝ[n] provided by the decision feedback equalizer 20 are given by the following equation: $\begin{aligned} \hat{s}[n] &= \tilde{g}_{F}^{T}\, \tilde{y}_{F}[n] + g_{B}^{T}\, \tilde{s}[n] \\ &= \tilde{g}^{T}\, \tilde{y}[n] \end{aligned} \quad (6)$ The equalizer input y[n] and the feed forward tap weights g_(F) are complex values, the feedback tap weights g_(B) are real, and the output ŝ[n] of the decision feedback equalizer 20 is real. The imaginary part of the output of the feed forward filter 22, g_(FI), may be ignored and, therefore, not computed.
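As a quick numerical check of the stacking in equations (2) through (6), the following sketch compares the complex form of equation (1) with the single real inner product of equation (6). The sizes, random values, and variable names are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ff, n_fb = 4, 3                          # small illustrative filter lengths

g_f = rng.normal(size=n_ff) + 1j * rng.normal(size=n_ff)   # complex feed forward taps
g_b = rng.normal(size=n_fb)                                 # real feedback taps
y = rng.normal(size=n_ff) + 1j * rng.normal(size=n_ff)      # complex equalizer input
s_dec = rng.choice([-7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0], size=n_fb)

# Equation (1): complex form.
s_hat_complex = np.real(np.vdot(g_f, y)) + g_b @ s_dec

# Equations (2)-(5): stack real parts, imaginary parts, and decisions.
y_tilde = np.concatenate([y.real, y.imag, s_dec])
g_tilde = np.concatenate([g_f.real, g_f.imag, g_b])

# Equation (6): one real inner product.
s_hat_real = g_tilde @ y_tilde
```

Both expressions give the same real output because Re{g_F^H y} expands to g_FR^T y_R + g_FI^T y_I.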

It is desired to determine the tap weight vector g̃[n] so that the error at the output of the decision feedback equalizer 20 is minimized according to the following equations: $\tilde{g}[n] = \arg\min_{g} \tfrac{1}{2} E\left\{ \left( s[n] - \hat{s}[n] \right)^{2} \right\} \quad (7)$ $\begin{aligned} F(g) &= E\left\{ \left( s[n] - \hat{s}[n] \right)^{2} \right\} \\ &= \sigma_{s}^{2} - \tilde{g}^{H}[n]\, r_{s\tilde{y}} - r_{s\tilde{y}}^{H}\, \tilde{g}[n] + \tilde{g}^{H}[n]\, R_{\tilde{y}\tilde{y}}\, \tilde{g}[n] \end{aligned} \quad (8)$ where σ_(s)² is the symbol variance and is equal to the average symbol energy, R_(ỹỹ) is an autocorrelation matrix, and r_(sỹ) is a cross correlation vector. The autocorrelation matrix R_(ỹỹ) and the cross correlation vector r_(sỹ) are defined in detail hereinafter. The error function F(g) is minimized by setting its gradient to zero according to the following equation: $\nabla F(g) = R_{\tilde{y}\tilde{y}}\, \tilde{g}[n] - r_{s\tilde{y}} = 0 \quad (9)$ This minimization leads to the following well known Wiener-Hopf equation: $R_{\tilde{y}\tilde{y}}\, \tilde{g}[n] = r_{s\tilde{y}} \quad (10)$ Solving for the desired tap weights g̃[n] leads to the following equation: $\tilde{g}[n] = R_{\tilde{y}\tilde{y}}^{-1}\, r_{s\tilde{y}} \quad (11)$ So, in order to calculate the desired tap weight vector, the large system of equations represented by equation (11) must be solved. This solution requires the inversion of the large matrix R_(ỹỹ) and, therefore, a considerable amount of computation.

One method for reducing the amount of computation involved in solving a large system of equations is the aforementioned conjugate gradient algorithm. In order to set up equation (11), estimation of the auto-correlation matrix R_(ỹỹ)[n=0] and the cross correlation vector r_(sỹ)[n=0] is required. By definition, the auto-correlation matrix R_(ỹỹ)[n] is given by the following equation: $R_{\tilde{y}\tilde{y}}[n] = E\left[ \tilde{y}[n]\, \tilde{y}^{H}[n] \right] \quad (12)$ and the cross correlation vector r_(sỹ)[n] is given by the following equation: $r_{s\tilde{y}}[n] = E\left[ s[n]\, \tilde{y}[n] \right] \quad (13)$ where s[n] represents the transmitted symbols, E is the expectation operator, and H denotes the Hermitian transpose.

The auto-correlation matrix R_(ỹỹ)[n] and the cross correlation vector r_(sỹ)[n] can be estimated using the forgetting factor method according to the following equations: $\begin{aligned} \hat{R}_{\tilde{y}\tilde{y}}[n] &= \sum_{m=0}^{n-1} \gamma^{m}\, \tilde{y}[n-m]\, \tilde{y}^{H}[n-m] \\ &= \gamma\, \hat{R}_{\tilde{y}\tilde{y}}[n-1] + \tilde{y}[n]\, \tilde{y}^{H}[n] \end{aligned} \quad (14)$ $\begin{aligned} \hat{r}_{s\tilde{y}}[n] &= \sum_{m=0}^{n-1} \gamma^{m}\, s[n-m]\, \tilde{y}^{H}[n-m] \\ &= \gamma\, \hat{r}_{s\tilde{y}}[n-1] + s[n]\, \tilde{y}^{H}[n] \end{aligned} \quad (15)$ where 0≤γ<1 is the forgetting factor. If γ is set to 0, equations (14) and (15) yield the estimates of the auto-correlation matrix R̂_(ỹỹ) and the cross correlation vector r̂_(sỹ) from the most recent sample vector ỹ alone.
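The recursive forms of equations (14) and (15) can be checked against the explicit sums with a short Python sketch. The helper name `update_correlations`, the small dimensions, and the random test data are assumptions made purely for illustration, and the estimates are assumed to start from zero.

```python
import numpy as np

def update_correlations(R_prev, r_prev, y_tilde, s, gamma):
    """One step of the recursive forms of equations (14) and (15)."""
    R = gamma * R_prev + np.outer(y_tilde, y_tilde.conj())
    r = gamma * r_prev + s * y_tilde.conj()
    return R, r

rng = np.random.default_rng(1)
dim, n_samples, gamma = 3, 6, 0.9
Y = rng.normal(size=(n_samples, dim))      # rows hold y~[1], ..., y~[n]
S = rng.choice([-3.0, -1.0, 1.0, 3.0], size=n_samples)

# Recursive estimates, starting from zero.
R = np.zeros((dim, dim))
r = np.zeros(dim)
for y_t, s_t in zip(Y, S):
    R, r = update_correlations(R, r, y_t, s_t, gamma)

# Direct evaluation of the sums in equations (14) and (15).
n = n_samples
R_sum = sum(gamma**m * np.outer(Y[n - 1 - m], Y[n - 1 - m]) for m in range(n))
r_sum = sum(gamma**m * S[n - 1 - m] * Y[n - 1 - m] for m in range(n))
```

Setting `gamma` to 0 in this sketch leaves only the outer product of the most recent sample vector.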

In determining the tap weights g_(F) for the feed forward filter 22 and g_(B) for the feedback filter 24, the controller 30 initializes the auto-correlation matrix R_(ỹỹ)[0] using equation (14) and the cross correlation vector r_(sỹ)[0] using equation (15) by performing averaging over a period of time such as ½ data field or so. The controller 30 also initializes the tap weights g̃₀[0]=0 and initializes the decision vector s̃[0] to the first N_(fb) of the known training symbols that are contained in the field sync portion of each data field as discussed above.

Further, the controller 30 initializes the data vector ỹ[0] as follows. The data vector ỹ_(F)[0] as given in equation (2) is initialized by setting the real part y_(R)[n=0]={y_(R)[0], . . . , y_(R)[N_(ff)−1]}^(T) such that y_(R)[0], . . . , y_(R)[N_(ts)−N_(fb)−1] are the real part of the remaining training data (after those in s̃[0]) and y_(R)[N_(ts)−N_(fb)], . . . , y_(R)[N_(ff)−1] are the real part of the received data after the training sequence. This setting is repeated for y_(I)[0]. Then, ỹ_(F)[0]=[y_(R)^(T)[0] y_(I)^(T)[0]]^(T) is constructed as given in equation (2) and ỹ[0]=[ỹ_(F)^(T)[0] s̃^(T)[0]]^(T) is constructed as given in equation (3) based on the initialized vectors ỹ_(F)[0], y_(R)[0], and y_(I)[0] as given above. In an ATSC digital television environment, for example, the number of training symbols N_(ts) may be 728, the number of feed forward taps N_(ff) may be 512, and the number of feedback taps N_(fb) may also be 512.

Further, the controller 30 initializes the output ŝ[n] of the decision feedback equalizer 20 as ŝ[0]=g̃₀^(T)[0]ỹ[0] in accordance with equation (6), the controller 30 initializes a gradient t_(k)[n] as t₀[0]=R̂_(ỹỹ)[0]g̃₀[0]−r̂_(sỹ)[0], and the controller 30 sets the variable n to 0.

The controller 30 then executes a conjugate gradient algorithm once for each received data element, where each iteration of the algorithm is initiated upon the shifting of a new received data element into the feed forward filter 22. The parameters of the conjugate gradient algorithm as described above are defined in the table.

TABLE

Parameter Name                             Equalizer Notation
Objective Function Argument (Equalizer)    g_(k)[n]
Argument Step Size                         α_(k)[n]
Gradient                                   t_(k)[n]
Conjugate Directions                       b_(k)[n]
Conjugate Direction Step Size              β_(k)[n]

This conjugate gradient algorithm is as follows:

1. Calculate conjugate direction b_(k)[n]:
   if n modulo N_(taps) = 0:
      b₀[n] = −t₀[0] (reset)
   else:
      $\beta_{0}[n] = \frac{\left( t_{0}[n] - t_{-1}[n] \right)^{H} t_{0}[n]}{\left( t_{0}[n] - t_{-1}[n] \right)^{H} b_{-1}[n]}$
      b₀[n] = −t₀[n] + β₀[n]b₋₁[n]
2. Calculate argument step size α_(k)[n]:
   $\alpha_{0}[n] = \frac{- b_{0}^{H}[n]\, t_{0}[n]}{b_{0}^{H}[n]\, \hat{R}_{\tilde{y}\tilde{y}}[n]\, b_{0}[n]}$
3. Tap update:
   g̃₁[n] = g̃₀[n] + α₀[n]b₀[n]
4. n = n+1:
   g̃₀[n] = g̃₁[n−1]
   t₋₁[n] = t₀[n−1]
   b₋₁[n] = b₀[n−1]
5. Update input vector ỹ[n] with the next received data element and update R̂_(ỹỹ)[n] per equation (14).
6. Calculate output of the decision feedback equalizer 20:
   ŝ[n] = g̃₀^(T)[n]ỹ[n]
7. Calculate gradient t_(k)[n]:
   t₀[n] = t₋₁[n] + α₀[n]R̂_(ỹỹ)[n]b₀[n]
8. Return to step 1.
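The direction, step size, tap update, and gradient update of steps 1, 2, 3, and 7 can be sketched in Python for the simpler case of a fixed auto-correlation matrix and cross correlation vector. This is a simplified illustration, not the adaptive per-sample procedure listed above: it omits the data shifting of steps 5 and 6, and the function name `conjugate_gradient`, the tolerance, and the test matrix are assumptions.

```python
import numpy as np

def conjugate_gradient(R, r, n_iter=None, tol=1e-12):
    """Solve R g = r for a Hermitian positive definite R (fixed, not adaptive)."""
    n = len(r)
    n_iter = n if n_iter is None else n_iter
    g = np.zeros(n, dtype=np.result_type(R, r))
    t = R @ g - r                          # initial gradient, as for t_0[0]
    t_prev = b_prev = None
    for k in range(n_iter):
        if np.linalg.norm(t) < tol:
            break                          # converged; avoids 0/0 in beta
        if k % n == 0:
            b = -t                         # reset to the negative gradient
        else:
            d = t - t_prev
            beta = np.vdot(d, t) / np.vdot(d, b_prev)
            b = -t + beta * b_prev         # step 1: conjugate direction
        Rb = R @ b
        alpha = -np.vdot(b, t) / np.vdot(b, Rb)    # step 2: step size
        g = g + alpha * b                  # step 3: tap update
        t_prev, b_prev = t, b              # step 4: shift stored quantities
        t = t + alpha * Rb                 # step 7: gradient update
    return g

# Usage on a small, well conditioned test system (made-up data).
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8))
R_test = A @ A.T + 8.0 * np.eye(8)
r_test = rng.normal(size=8)
g_cg = conjugate_gradient(R_test, r_test)
```

For a fixed positive definite system of dimension N, this iteration reaches the Wiener-Hopf solution of equation (10) in at most N steps in exact arithmetic, without inverting the matrix.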

The largest part of the computations required by this conjugate gradient algorithm is contributed by the matrix-vector multiplications R̂_(ỹỹ)[n]b₀[n] of steps 2 and 7. These multiplications require the update of matrix R̂_(ỹỹ)[n] from the vector ỹ[n] as in equation (14). This update requires one multiply for each of the N_(taps)² matrix elements, a total of N_(taps)² multiplies. Then, the matrix-vector multiplication requires an additional N_(taps)² multiplies. Accordingly, the total computation is N_(taps)²+N_(taps)²=2N_(taps)² multiplies.

This computational load can be decreased as explained below.

A matrix-vector multiplication given by a=Yb, where Y is a convolution matrix, b is a vector, and a is the multiplication result, is equivalent to a[n]=y[n]*b[n] where * denotes linear convolution. These equations may be represented in matrix form by the following equation: $\begin{matrix} {\begin{bmatrix} {a\lbrack 0\rbrack} \\ {a\lbrack 1\rbrack} \\ {a\lbrack 2\rbrack} \end{bmatrix} = {\begin{bmatrix} {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} \\ {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} \\ {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} \end{bmatrix}\begin{bmatrix} {b\lbrack 0\rbrack} \\ {b\lbrack 1\rbrack} \\ {b\lbrack 2\rbrack} \end{bmatrix}}} & (16) \end{matrix}$ where n is chosen as a small number for explanatory purposes only. The columns and rows of the matrix Y in equation (16) can be circularly extended in accordance with the following equation: $\begin{matrix} {\begin{bmatrix} {a\lbrack 0\rbrack} \\ {a\lbrack 1\rbrack} \\ {a\lbrack 2\rbrack} \\ {dc} \\ {dc} \end{bmatrix} = {\begin{bmatrix} {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} \\ {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & {y\lbrack 2\rbrack} \\ {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} \\ {y\left\lbrack {- 2} \right\rbrack} & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} \\ {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} \end{bmatrix}\begin{bmatrix} {b\lbrack 0\rbrack} \\ {b\lbrack 1\rbrack} \\ {b\lbrack 2\rbrack} \\ 0 \\ 0 \end{bmatrix}}} & (17) \end{matrix}$ where dc means don't 
care. Equation (17) is equivalent to a[n]=y[n]⊗b[n] where ⊗ denotes circular convolution. Circular convolution is equivalent to the following operation: $\mathcal{Y} = \mathrm{FFT}(y) \quad (18)$ $\mathcal{B} = \mathrm{FFT}(b) \quad (19)$ $\mathcal{A} = \mathcal{Y} \times \mathcal{B} \quad (20)$ $a = \mathrm{IFFT}(\mathcal{A}) \quad (21)$ where FFT in equations (18) and (19) indicates the Fast Fourier Transform operation, IFFT in equation (21) indicates the Inverse Fast Fourier Transform operation, and × in equation (20) indicates point-wise vector multiplication such that the elements of the resultant vector are given by $\mathcal{Y}_{0} \times \mathcal{B}_{0}$, $\mathcal{Y}_{1} \times \mathcal{B}_{1}$, $\mathcal{Y}_{2} \times \mathcal{B}_{2}$, etc.
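The equivalence between the matrix-vector product of equation (16) and the FFT route of equations (17) through (21) can be checked numerically with NumPy. The sample values chosen for y[−2], . . . , y[2] and b[0], b[1], b[2] below are arbitrary assumptions used only for illustration.

```python
import numpy as np

# Illustrative values for y[-2], ..., y[2] and b[0], b[1], b[2].
y = {-2: 0.5, -1: -1.0, 0: 2.0, 1: 3.0, 2: -0.5}
b = np.array([1.0, -2.0, 0.5])

# Equation (16): direct 3 x 3 Toeplitz matrix-vector product.
Y = np.array([[y[i - j] for j in range(3)] for i in range(3)])
a_direct = Y @ b

# Equation (17): first column of the 5 x 5 circular extension, zero-padded b.
y_circ = np.array([y[0], y[1], y[2], y[-2], y[-1]])
b_pad = np.concatenate([b, np.zeros(2)])

# Equations (18)-(21): FFT, point-wise multiply, inverse FFT.
a_fft = np.fft.ifft(np.fft.fft(y_circ) * np.fft.fft(b_pad)).real

# The first three entries are a[0], a[1], a[2]; the rest are "don't care".
a_from_fft = a_fft[:3]
```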

An embellishment on the above process involves 0 padding the columns to lengths of 2^(n). This 0 padding, as is well known, permits a computationally efficient FFT. Accordingly, the matrix-vector multiplication a=Yb can be written with 0 padding in matrix form according to the following equation: $\begin{matrix} {\begin{bmatrix} {a\lbrack 0\rbrack} \\ {a\lbrack 1\rbrack} \\ {a\lbrack 2\rbrack} \\ {dc} \\ {dc} \\ {dc} \\ {dc} \\ {dc} \end{bmatrix} = {\begin{bmatrix} {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & 0 & 0 & 0 & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} \\ {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & 0 & 0 & 0 & {y\lbrack 2\rbrack} \\ {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & 0 & 0 & 0 \\ 0 & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & 0 & 0 \\ 0 & 0 & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & 0 \\ 0 & 0 & 0 & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} \\ {y\left\lbrack {- 2} \right\rbrack} & 0 & 0 & 0 & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} & {y\left\lbrack {- 1} \right\rbrack} \\ {y\left\lbrack {- 1} \right\rbrack} & {y\left\lbrack {- 2} \right\rbrack} & 0 & 0 & 0 & {y\lbrack 2\rbrack} & {y\lbrack 1\rbrack} & {y\lbrack 0\rbrack} \end{bmatrix}\begin{bmatrix} {b\lbrack 0\rbrack} \\ {b\lbrack 1\rbrack} \\ {b\lbrack 2\rbrack} \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}}} & (22) \end{matrix}$ It is noted that matrix Y has a Toeplitz structure, and that Y^(H) is also a Toeplitz matrix. (H denotes Hermitian transpose.)
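A general version of the zero-padded product of equation (22) might look like the following Python sketch. The helper name `toeplitz_matvec_fft` and its `col`/`row` arguments (the first column and first row of the Toeplitz matrix) are assumptions introduced for illustration; the padded length is chosen as the smallest power of two that is at least 2N−1.

```python
import numpy as np

def toeplitz_matvec_fft(col, row, b):
    """Multiply the N x N Toeplitz matrix with first column `col` and first
    row `row` by the vector `b`, using FFTs of power-of-two length."""
    n = len(b)
    size = 1 << (2 * n - 2).bit_length()     # power of two >= 2N - 1
    c = np.zeros(size, dtype=complex)
    c[:n] = col                               # y[0], y[1], ..., y[N-1]
    c[size - n + 1:] = row[:0:-1]             # y[-(N-1)], ..., y[-1]
    b_pad = np.zeros(size, dtype=complex)
    b_pad[:n] = b                             # b followed by zeros
    a = np.fft.ifft(np.fft.fft(c) * np.fft.fft(b_pad))
    return a[:n]                              # remaining entries are don't care

# Usage: compare with an explicitly built Toeplitz matrix (made-up data).
rng = np.random.default_rng(2)
n = 5
col = rng.normal(size=n)
row = rng.normal(size=n)
row[0] = col[0]                               # shared diagonal element y[0]
T = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
              for i in range(n)])
b_vec = rng.normal(size=n)
a_fast = toeplitz_matvec_fft(col, row, b_vec)
```

For N=3 the padded length is 8, which matches the 8 x 8 matrix of equation (22).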

It will be shown that the need for both the R_(ỹỹ) update and the R_(ỹỹ) multiply can be eliminated, and that the matrix R_(ỹỹ) need not even be created.

Let γ=0 for equation (14). It is well known that R_(ỹỹ)=ỹỹ^(H)=YY^(H) where Y is a convolution matrix based on the vector y (that is, y forms the first column of the matrix Y). The vector ỹ may be 0 padded to length 2^(n). Let y′ be the first column of Y^(H). For the conjugate gradient operation, the calculation given by the following equation must be made: $\nu = R_{\tilde{y}\tilde{y}}[n]\, b_{0}[n] = Y^{H} Y\, b_{0}[n] \quad (23)$ In order to calculate ν, the quantity Yb₀[n] should first be calculated according to the following equations: $\mathcal{Y} = \mathrm{FFT}(y) \quad (24)$ $\mathcal{B} = \mathrm{FFT}(b) \quad (25)$ $X = \mathcal{Y} \times \mathcal{B} \quad (26)$ $x = \mathrm{IDFT}(X) \quad (27)$ $x_{N} = \text{first } N \text{ elements of } x\text{, remaining elements} = 0 \quad (28)$ $X_{N} = \mathrm{FFT}(x_{N}) \quad (29)$ where × in equation (26) denotes point-wise vector multiplication. Then, Y^(H)x_(N) is calculated according to the following equations: $\mathcal{Y}' = \mathrm{FFT}(y') \quad (30)$ $\mathcal{V} = \mathcal{Y}' \times X_{N} \quad (31)$ $\nu = \mathrm{IDFT}(\mathcal{V}) \quad (32)$ $\nu_{N} = \text{first } N \text{ elements of } \nu\text{, remaining elements} = 0 \quad (33)$ The number of multiplies in this process, particularly for large N_(taps), is much less than 2N_(taps)².

In the process given by equations (24)-(33), b is the vector b₀ of the eight step algorithm described above, y is a vector of received symbols as described above, and ν_(N) is the quantity R̂_(ỹỹ)[n]b₀[n] in steps 2 and 7 of the conjugate gradient algorithm described above.
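The two stage process of equations (24)-(33) can be sketched as follows. This is a minimal illustration in Python with NumPy under stated assumptions, not a definitive implementation: Y is assumed to be the N×N lower triangular Toeplitz matrix whose first column is y, the transform length is taken as 2N, y′ is taken as the first column of the circulant matrix that embeds Y^(H), and the names `gram_matvec_fft`, `B`, and `v` are chosen here for illustration.

```python
import numpy as np

def gram_matvec_fft(y, b):
    """Return Y^H Y b, with Y the N-by-N lower triangular Toeplitz matrix
    whose first column is y (an assumption of this sketch)."""
    y = np.asarray(y, dtype=complex)
    n = len(y)
    size = 2 * n                       # any length >= 2N - 1 avoids wrap-around

    y_pad = np.zeros(size, dtype=complex)
    y_pad[:n] = y
    b_pad = np.zeros(size, dtype=complex)
    b_pad[:n] = b

    gamma = np.fft.fft(y_pad)          # (24)
    B = np.fft.fft(b_pad)              # (25)
    x = np.fft.ifft(gamma * B)         # (26), (27)
    x_n = np.zeros(size, dtype=complex)
    x_n[:n] = x[:n]                    # (28): keep Y b, zero the remaining elements
    X_n = np.fft.fft(x_n)              # (29)

    # First column of the circulant embedding of Y^H:
    # y_prime[k] = conj(y_pad[(-k) mod size]).
    y_prime = np.conj(np.roll(y_pad[::-1], 1))
    gamma_p = np.fft.fft(y_prime)      # (30)
    v = np.fft.ifft(gamma_p * X_n)     # (31), (32)
    return v[:n]                       # (33)

y = np.array([1.0, 0.5, -0.25, 0.125])
b = np.array([1.0, 0.0, -1.0, 2.0])
v_n = gram_matvec_fft(y, b)
```

The truncation in equation (28) matters: without it, the wrap-around terms of the first circular convolution would leak into the second stage.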

The conjugate gradient algorithm described above can be used to make other calculations. For example, the conjugate gradient algorithm described above can be used to calculate updated channel impulse response estimates. This calculation can be useful, for instance, because known alternative methods of calculating MMSE (Minimum Mean Square Error) equalizer tap weights are based on having a channel impulse response estimate.

The conjugate gradient method described above permits the continuous calculation of channel impulse response estimate updates between training sequences (when only a priori unknown symbols are received) in a receiver for a time varying multipath channel. This calculation utilizes a least squares channel impulse response estimation calculated from an over-determined system of equations. Dynamic changes to the channel impulse response may be tracked by using receiver trellis decoder decisions on input symbols to form a long sequence of almost perfectly known symbols. This sequence should have relatively few errors.

The channel impulse response to be estimated is of length L_(h)=L_(ha)+L_(hc)+1 where L_(ha) is the length of the anti-causal part of the channel impulse response and L_(hc) is the length of the causal part of the channel impulse response.

The reliable trellis decoder decisions on the input symbols form a vector designated herein as s of length L_(s). A Toeplitz matrix S may be defined based on this vector s according to the following equation: $\begin{matrix} {S = \begin{bmatrix} s_{L_{h} - 1} & s_{L_{h} - 2} & \cdots & \cdots & s_{0} \\ \vdots & s_{L_{h} - 1} & \cdots & \cdots & \vdots \\ \vdots & \vdots & \quad & \vdots & \vdots \\ \vdots & \vdots & \quad & \vdots & s_{L_{h} - 1} \\ \vdots & \vdots & \quad & s_{L_{s} - L_{h}} & \vdots \\ s_{L_{s} - 1} & s_{L_{s} - 2} & \cdots & \cdots & s_{L_{s} - L_{h}} \end{bmatrix}} & (34) \end{matrix}$ where the elements in the matrix of equation (34) are real and consist of the symbol decisions of vector s.

To ensure an over-determined system of equations, the following inequality is necessary: L _(S)>2L _(h)−1.  (35) The matrix S is of dimension (L_(S)−L_(h)+1)×L_(h) with (L_(S)−L_(h)+1)≧L_(h). The received signal vector is y with data elements y_(i) for L_(hc)≦i≦(L_(S)−L_(ha)−1) where y_(i) is the received symbol corresponding to input symbol decision s_(i). Then, y=Sh+u where h is the L_(h) long channel impulse response vector and u is a white noise vector.

The least squares solution for h is given by the following equation: ĥ_(LS)=(S ^(T) S)⁻¹ S ^(T) y.  (36)
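The construction of S in equation (34) and the least squares solution of equation (36) can be sketched numerically. The following Python/NumPy sketch uses toy sizes and simulated data chosen only for illustration; the function names `decision_matrix` and `least_squares_channel`, the noise level, and the random channel are all assumptions of the sketch.

```python
import numpy as np

def decision_matrix(s, L_h):
    """Build the (L_s - L_h + 1)-by-L_h Toeplitz matrix S of equation (34)."""
    s = np.asarray(s, dtype=float)
    rows = len(s) - L_h + 1
    # Row i reads s[i + L_h - 1], ..., s[i + 1], s[i] from left to right.
    return np.array([s[i:i + L_h][::-1] for i in range(rows)])

def least_squares_channel(s, y, L_h):
    """Solve the normal equations (S^T S) h = S^T y of equation (36)."""
    S = decision_matrix(s, L_h)
    return np.linalg.solve(S.T @ S, S.T @ y)

# Toy sizes (hypothetical, far smaller than the 8 VSB values in the text).
rng = np.random.default_rng(0)
L_h, L_s = 8, 64
levels = np.array([-7, -5, -3, -1, 1, 3, 5, 7], dtype=float)  # 8 VSB symbol levels
s = rng.choice(levels, size=L_s)
h_true = rng.standard_normal(L_h)
y = decision_matrix(s, L_h) @ h_true + 0.01 * rng.standard_normal(L_s - L_h + 1)
h_ls = least_squares_channel(s, y, L_h)
```

Solving the normal equations directly is shown only to make equation (36) concrete; for the sizes discussed in the text, the direct solve is the costly operation that the conjugate gradient approach is intended to avoid.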

By utilizing reliable trellis decoder input symbol decisions, there should be sufficient support for calculating a channel impulse response estimate of the required length. As required by inequality (35), the vector s of symbol decisions must be at least twice as long as the channel impulse response being estimated.

For an 8 VSB receiver application, the following parameters may be assumed: L_(h)=512, L_(ha)=63, L_(hc)=448, and L_(S)=2496. The vector s may be formed from a sequence of trellis decoder decisions on the input symbols. Normally, the trellis decoder would just make output bit pair decisions, but it can also make equally reliable decisions on the input symbols.
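As a quick consistency check on these example parameters (simple arithmetic on the values stated above, not an additional requirement), the channel length, the row count of the matrix S, and inequality (35) work out as follows: $\begin{matrix} {L_{h} = L_{ha} + L_{hc} + 1 = 63 + 448 + 1 = 512} \\ {L_{S} - L_{h} + 1 = 2496 - 512 + 1 = 1985 \geq 512} \\ {2L_{h} - 1 = 1023 < 2496 = L_{S}} \end{matrix}$ With these values, the matrix S therefore has 1985 rows and 512 columns.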

The vector s, for example, may be selected as 3 segments (L_(S)=2496 symbols) long. So, three data segments may be used to produce a single channel impulse response estimate update. A new channel impulse response estimate update can be obtained once per segment by proceeding in a sliding window manner. Optionally, several consecutive channel impulse response estimate updates can be averaged in order to further improve channel impulse response estimate accuracy if necessary. Of course, this additional step of averaging can be a problem if the channel impulse response is varying rapidly.

A vector s with fewer than 3 segments of symbol decisions may be used, but as stated in inequality (35), the vector s must be at least twice as long as the channel impulse response to be estimated. Longer s vectors help to diminish the adverse effect of additive white Gaussian noise on the channel.

The system of equations represented by equation (36) may be solved using the conjugate gradient algorithm in a manner similar to that previously described to solve equation (11) for the MMSE tap weights where S^(T)y in equation (36) takes the place of r_(sỹ) in equation (11), where S^(T)S in equation (36) takes the place of R_(ỹỹ) in equation (11), and where ĥ_(LS) in equation (36) takes the place of g̃ in equation (11).

So, the conjugate gradient algorithm described above can facilitate the computation of (S^(T)S)b for the case of the channel impulse response estimation just as it facilitated the computation of R_(ỹỹ)b for the calculation of tap weights.

To further reduce the computational complexity of the matrix-vector multiplications required by the conjugate gradient algorithm, such as when the conjugate gradient algorithm is used to estimate a channel as described above, it is useful to note that the conjugate gradient algorithm may be used to solve the following linear system for the channel estimate h: Sh=y.  (37) The matrix S is an m×n (e.g., 1985×512) Toeplitz matrix derived from symbol decisions, and y is a vector (the observation vector) derived from the received data. The symbol decisions can be made, for example, by a trellis decoder or a slicer.

The system may be written as S^(T)Sh=S^(T)y by multiplying both sides of equation (37) by S^(T). The conjugate gradient algorithm can be used to solve this system for the channel estimate h according to the following steps:

-   (1) Calculate the residual vector r₁ representing the system S^(T)y−S^(T)Sh₁ as follows:
    r₁=S^(T)y−S^(T)Sh₁  (38)
    where r₁ is a 512×1 vector, and where h₁ is the previous channel estimate which may be initialized to any desired value such as h₁=0.
-   (2) For k=1 to n, iteratively calculate the following:
    (a) d_(k)=r_(k)+β_(k)d_(k−1)  (39)
    where β₁=0, ${\beta_{k \geq 2} = \frac{r_{k}^{T} \cdot r_{k}}{r_{k - 1}^{T} \cdot r_{k - 1}}},$ and where d_(k) is a 512×1 vector;
    (b) h_(k+1)=h_(k)+α_(k)d_(k)  (40)
    where h_(k+1) is a 512×1 vector, where α_(k) is 1×1, where ${\alpha_{k} = \frac{r_{k}^{T} \cdot r_{k}}{d_{k} \cdot q_{k}}},$ where q_(k)=S^(T)Sd_(k), where S^(T) is a 512×1985 matrix, and where S is a 1985×512 matrix; and,
    (c) r_(k+1)=r_(k)−α_(k)q_(k)  (41)
    where r_(k+1) is a 512×1 vector.

Steps 2(a)-(c) are repeated until a desired accuracy is achieved for the channel estimate h. The index k, for example, may be incremented up to a value of 5, although any other desired value can be used.
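The iteration of equations (38)-(41) can be sketched as follows. This is a minimal dense-matrix illustration in Python with NumPy, assuming real-valued data; the function name `cg_normal_equations` and the early exit on a zero residual are choices made for the sketch, and the products with S and S^(T) are computed directly rather than with FFTs.

```python
import numpy as np

def cg_normal_equations(S, y, h1=None, iterations=5):
    """Conjugate gradient iterations on S^T S h = S^T y, per equations (38)-(41)."""
    n = S.shape[1]
    h = np.zeros(n) if h1 is None else np.array(h1, dtype=float)
    r = S.T @ y - S.T @ (S @ h)            # (38): initial residual r_1
    d = np.zeros(n)                        # d_0 is irrelevant because beta_1 = 0
    rr_prev = None
    for _ in range(iterations):
        rr = r @ r
        if rr == 0.0:                      # residual already zero: h is exact
            break
        beta = 0.0 if rr_prev is None else rr / rr_prev
        d = r + beta * d                   # (39)
        q = S.T @ (S @ d)                  # q_k = S^T S d_k
        alpha = rr / (d @ q)
        h = h + alpha * d                  # (40)
        r = r - alpha * q                  # (41)
        rr_prev = rr
    return h

rng = np.random.default_rng(1)
S = rng.standard_normal((40, 6))           # toy over-determined system
h_true = rng.standard_normal(6)
y = S @ h_true
h_est = cg_normal_equations(S, y, iterations=6)
```

Note that q_(k) is formed as two successive matrix-vector products, S d_(k) followed by S^(T) applied to the result, so that the matrix S^(T)S is never built explicitly.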

It will be seen that step 2(b) includes the matrix vector multiplication S^(T)Sd_(k). This operation is computationally very complex. A technique for reducing the computational complexity of implementing this operation is suggested above with respect to equations (23)-(33).

A straightforward approach to this technique would be to form the FFT of each of the three components S^(T), S, and d_(k), point-wise vector multiply the three resulting FFTs, and then take the inverse FFT of the multiplication result. This series of calculations, however, does not provide an accurate answer because the matrix S is not a true convolution matrix, and the IFFT of the triple product, therefore, produces a corrupted result.

FIG. 3 illustrates the matrix S which is derived from a vector of symbol decisions of length 2496 (three segments of a VSB field). The first row of the matrix S contains symbol decision 512 in the first column down to symbol decision 1 in the last column, the second row of the matrix S contains symbol decision 513 in the first column down to symbol decision 2 in the last column, and so on. The portions of the matrix S which prevent it from being a true convolution matrix are the triangular areas 40 and 42. The diagonal lines in FIG. 3 are parallel.

According to the present invention, and as shown in FIG. 4, matrix S of FIG. 3 is decomposed to form two matrices Ŝ and F such that the matrix Ŝ is a true convolution matrix. Accordingly, the area 44 of the matrix Ŝ is the same as the area 44 of the matrix S, and the symbol decisions in the areas 40 and 42 of the matrix Ŝ are set to zero. The symbol decisions in the area 44 of the matrix F are set to zero, and the areas 40 and 42 of the matrix F are the same as the areas 40 and 42 of the matrix S. A new linear system based on Ŝ and F is then defined which may be solved for the channel estimate h. Advantageously, FFT operations of Ŝ may be used for implementing the matrix vector multiplication of the conjugate gradient algorithm without corruption because Ŝ is a true convolution matrix.
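The decomposition S=Ŝ+F can be sketched numerically. In the following Python/NumPy sketch, the triangular areas 40 and 42 are assumed to be the upper right and lower left corners of S, which is an interpretation of FIG. 3 and FIG. 4 rather than something the figures are quoted as stating; the function name `split_decision_matrix` is hypothetical. Under that assumption, Ŝ is exactly the full convolution matrix of the middle portion of s.

```python
import numpy as np

def split_decision_matrix(s, L_h):
    """Return (S, S_hat, F) with S = S_hat + F and S_hat a true convolution matrix."""
    s = np.asarray(s, dtype=float)
    m = len(s) - L_h + 1                       # number of rows of S
    i, j = np.indices((m, L_h))
    S = s[i - j + L_h - 1]                     # S[i, j] = s[i - j + L_h - 1]
    # Central band (area 44): entries a full convolution matrix would keep.
    band = (i - j >= 0) & (i - j <= m - L_h)
    S_hat = np.where(band, S, 0.0)             # triangles (areas 40, 42) zeroed
    F = S - S_hat                              # only the two triangles remain
    return S, S_hat, F

s = np.arange(1.0, 13.0)                       # toy decisions 1, 2, ..., 12
L_h = 4
S, S_hat, F = split_decision_matrix(s, L_h)

# S_hat @ d equals the linear convolution of the middle portion of s with d.
d = np.array([1.0, -2.0, 0.5, 3.0])
middle = s[L_h - 1:S.shape[0]]
same = np.allclose(S_hat @ d, np.convolve(middle, d))
```

Because Ŝd is an ordinary linear convolution, it can be computed with zero padded FFTs without the corruption described above for S.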

More specifically, because the linear system is defined by y=Sh, the following equations result: y=(Ŝ+F)h  (42) y=Ŝh+Fh  (43) y−Fh=Ŝh  (44) Defining y−Fh₁=ŷ, where h₁ is a previous channel estimate derived by any known technique, the new system expression becomes ŷ=Ŝh. This expression can be written as Ŝ^(T)ŷ=Ŝ^(T)Ŝh. The conjugate gradient algorithm can then be applied to solve for h in the new system as follows:

-   (1) ŷ=y−Fh₁,
    r₁=Ŝ^(T)ŷ−Ŝ^(T)Ŝh₁
-   (2) For k=1 to n, iteratively calculate
    (a) d_(k)=r_(k)+β_(k)d_(k−1)
    (b) h_(k+1)=h_(k)+α_(k)d_(k)
    (c) r_(k+1)=r_(k)−α_(k)q_(k)
where h₁ may be initialized to any desired value such as 0, where q_(k)=Ŝ^(T)Ŝd_(k), and where d₀, r₀, β_(k), and α_(k) are as previously defined.

As indicated above following equation (40), step 2(b) includes calculating the expression q_(k)=Ŝ^(T)Ŝd_(k). This multiplication may be conveniently performed by forming the FFT of (i) the matrix Ŝ, (ii) d_(k), and (iii) Ŝ^(T). The three FFT results are then point-wise vector multiplied, and the inverse FFT (IFFT) is taken of the multiplication result. The first N elements of this IFFT produce q_(k). N, for example, can have a value of 512 for a 1024 point FFT/IFFT. However, N can have other values as desired.
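The triple product can be sketched as follows in Python with NumPy. The sketch makes several assumptions that go beyond the text: Ŝ is taken to be the full convolution matrix generated by a real sequence c (so that "the FFT of Ŝ" means the FFT of c), the FFT of Ŝ^(T) is taken to be the complex conjugate of the FFT of c (valid for real symbols), and the transform length is taken to be a power of two at least equal to the number of rows of Ŝ. The function name `conv_gram_matvec_fft` is hypothetical.

```python
import numpy as np

def conv_gram_matvec_fft(c, d):
    """Return q = S_hat^T S_hat d, where S_hat is the full convolution matrix
    of the real sequence c with len(d) columns."""
    n = len(d)
    m = len(c) + n - 1                 # number of rows of S_hat
    size = 1
    while size < m:                    # FFT length >= m avoids wrap-around
        size *= 2
    C = np.fft.rfft(c, size)           # FFT of S_hat (its generating sequence)
    D = np.fft.rfft(d, size)           # FFT of d_k
    # conj(C) plays the role of the FFT of S_hat^T for real c.
    q_full = np.fft.irfft(np.conj(C) * C * D, size)
    return q_full[:n]                  # first N elements give q_k

c = np.array([1.0, -3.0, 5.0, 7.0, -1.0])
d = np.array([0.5, -0.25, 1.0])
q = conv_gram_matvec_fft(c, d)
```

With these assumptions the product conj(C)·C is fixed for a given Ŝ, so it could be computed once and reused for every iteration k; only the FFT of d_(k) and one inverse FFT are needed per iteration.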

FIG. 5 shows a system 50 for implementing this technique so as to determine a channel estimate h. The system 50 includes a decoder 52 that decodes the received data y to produce the symbols s. As described above, the decoder 52 may be a trellis decoder. However, the decoder 52 may be other forms of decoders such as slicers. A channel estimator 54 produces a channel estimate h using the techniques described above.

As discussed above, the received data vector y may be related to the symbol decisions s by the following equation: y=Sh+u  (45) where S is a matrix formed of the symbol decisions and is of dimension (L_(S)−L_(h)+1)×L_(h) with (L_(S)−L_(h)+1)≧L_(h), where h is the L_(h) long channel impulse response vector, and where u is a white noise vector. Ignoring the noise vector, equation (45) may be rewritten as equations (42)-(44) where S is shown in FIG. 3, Ŝ and F are shown in FIG. 4, and S=Ŝ+F. As given above, ŷ=y−Fh₁ where h₁ is the previous channel estimate, and the following new system of equations is solved for the channel estimate: ŷ=Ŝh  (46)

However, the previous channel estimate h₁ may be too outdated to be used in the equation ŷ=y−Fh₁. More particularly, the lower left hand corner of the matrix F has the newest (most recent) symbol decisions, and the previous channel estimate h₁ may not be valid for these most recent symbol decisions, such as in a dynamic channel environment where multipath taps may have Doppler shifts. It is, therefore, desirable to have a more accurate channel vector than the previous estimate, which may be too old.

A simple remedy to this problem is to separate the matrix F into two matrices as defined in the following equation: F=F₁+F₂  (47) where FIG. 6 illustrates the decomposition of the matrix F to form the two matrices F₁ and F₂. The matrix F₁ contains the less recent symbol decisions of the matrix F, and the matrix F₂ contains the more recent symbol decisions. In addition, using a predicted channel vector denoted h_(pred), the vector ŷ may be redefined according to the following equation: ŷ=y−F₁h₁−F₂h_(pred)  (48) Then, equation (46) can be solved for h using the new ŷ in the conjugate gradient algorithm and FFT process described above.
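Equations (47) and (48) can be sketched as follows in Python with NumPy. The sketch assumes that F₁ is the upper right triangle of S (the oldest symbol decisions) and F₂ is the lower left triangle (the newest symbol decisions), which is an interpretation of FIG. 6; the function name `corrected_observation` and the toy data are hypothetical.

```python
import numpy as np

def corrected_observation(s, y, L_h, h_prev, h_pred):
    """Return y_hat = y - F1 @ h_prev - F2 @ h_pred, per equation (48)."""
    s = np.asarray(s, dtype=float)
    m = len(s) - L_h + 1
    i, j = np.indices((m, L_h))
    S = s[i - j + L_h - 1]                     # S[i, j] = s[i - j + L_h - 1]
    F1 = np.where(i - j < 0, S, 0.0)           # upper right: less recent decisions
    F2 = np.where(i - j > m - L_h, S, 0.0)     # lower left: more recent decisions
    return y - F1 @ h_prev - F2 @ h_pred

# Toy example: with a static channel, both corrections use the true channel,
# and y_hat reduces to S_hat @ h, a plain linear convolution.
rng = np.random.default_rng(3)
L_h = 4
s = rng.choice([-3.0, -1.0, 1.0, 3.0], size=14)
h = rng.standard_normal(L_h)
m = len(s) - L_h + 1
S = np.array([s[i:i + L_h][::-1] for i in range(m)])
y = S @ h
y_hat = corrected_observation(s, y, L_h, h_prev=h, h_pred=h)
```

Only the rows of ŷ that touch the two triangles are affected by the corrections; the middle rows of ŷ equal the corresponding rows of y.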

Several methods can be used to determine the predicted channel estimate h_(pred). For example, a simple linear extrapolation may be used based on k past channel estimates where k≧2. If k=2, for example, the predicted channel vector can be obtained by defining a difference vector Δ according to the following equation: Δ=h_(k−1)−h_(k−2)  (49) Then, the following equations result: h_(pred)=h_(k−1)+Δ  (50) h_(pred)=h_(k−1)+(h_(k−1)−h_(k−2))  (51) h_(pred)=2h_(k−1)−h_(k−2)  (52) In the case where k=2, equation (52) requires the storage of k=2 past channel estimates, namely h_(k−1) and h_(k−2). It is noted that h₁=h_(k−1). Using simple linear extrapolation can be justified by considering a ghost (multipath) caused by a vehicle (truck, airplane, etc.) moving at a nearly constant speed. In this case, if the time span between two consecutive channel estimates is short enough, a linear extrapolation will work satisfactorily. In the case of digital television, the channel estimate is updated once per 832 symbols (i.e., once per segment or every 77 microseconds) so that a linear extrapolation should be sufficient to improve the channel estimate.
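The extrapolation of equation (52) amounts to one line of arithmetic per tap. A minimal Python/NumPy sketch follows; the function name `predict_channel` and the toy tap values are hypothetical.

```python
import numpy as np

def predict_channel(h_km1, h_km2):
    """Linear extrapolation of equation (52): h_pred = 2*h_(k-1) - h_(k-2)."""
    return 2.0 * np.asarray(h_km1) - np.asarray(h_km2)

# Toy example: taps drifting by a constant amount per update.
h_start = np.array([1.0, 0.5, -0.2])
drift = np.array([0.01, -0.02, 0.03])
h_km2 = h_start + 1 * drift        # estimate two updates ago
h_km1 = h_start + 2 * drift        # most recent estimate
h_pred = predict_channel(h_km1, h_km2)
```

When the taps change linearly from one update to the next, as in this toy example, the prediction coincides with the next channel; otherwise it is only a first order approximation.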

Modifications of the present invention will occur to those practicing in the art of the present invention. For example, the Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (IFFT) are used in the present invention as described above. Alternatively, the Discrete Fourier Transform (DFT) and Inverse Discrete Fourier Transform (IDFT) can be used instead of the FFT and IFFT.

Accordingly, the description of the present invention is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the best mode of carrying out the invention. The details may be varied substantially without departing from the spirit of the invention, and the exclusive use of all modifications which are within the scope of the appended claims is reserved. 

1. A method of processing a received signal y to produce a current channel estimate from one or more past channel estimates that include a previous channel estimate h₁ comprising: (a) decoding the received signal y to form data s; (b) forming a convolution matrix Ŝ from a first portion of the data s; (c) forming a matrix F₁ from a second portion of the data s, wherein the second portion of the data s includes data that is less recent than the data in the first portion of the data s; (d) forming a matrix F₂ from a third portion of the data s, wherein the third portion of the data s includes data that is more recent than the data in the first portion of the data s; (e) determining a predicted channel estimate h_(pred) based on the one or more past channel estimates; and, (f) performing a conjugate gradient algorithm to determine the current channel estimate, wherein the conjugate gradient algorithm is based on the received signal y, the matrix Ŝ, the matrices F₁ and F₂, the predicted channel estimate h_(pred), and the previous channel estimate h₁.
 2. The method of claim 1 wherein the determining of a predicted channel estimate comprises extrapolating the predicted channel estimate h_(pred) from k of the past channel estimates.
 3. The method of claim 2 wherein k≧2.
 4. The method of claim 1 wherein the received signal y, the matrix Ŝ, the matrices F₁ and F₂, the predicted channel estimate h_(pred), and the previous channel estimate h₁ are related according to the following equation: ŷ=y−F ₁ h ₁ −F ₂ h _(pred).
 5. The method of claim 4 wherein the conjugate gradient algorithm is performed to solve the following equation: ŷ=Ŝh wherein h comprises the current channel estimate.
 6. The method of claim 4 wherein the performing of a conjugate gradient algorithm to determine the current channel estimate comprises performing the following algorithm: (1) ŷ=y−F₁h₁−F₂h_(pred), r₁=Ŝ^(T)ŷ−Ŝ^(T)Ŝh₁ (2) For k=1 to n, iteratively calculate (a) d_(k)=r_(k)+β_(k)d_(k−1) (b) h_(k+1)=h_(k)+α_(k)d_(k) (c) r_(k+1)=r_(k)−α_(k)q_(k) where β₁=0, ${\beta_{k \geq 2} = \frac{r_{k}^{T} \cdot r_{k}}{r_{k - 1}^{T} \cdot r_{k - 1}}},$ where ${\alpha_{k} = \frac{r_{k}^{T} \cdot r_{k}}{d_{k} \cdot q_{k}}},$ and where q_(k)=Ŝ^(T)Ŝd_(k).
 7. The method of claim 6 wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result.
 8. The method of claim 6 wherein the forming of a matrix Ŝ from the data s comprises: forming a matrix S from the data s, wherein the matrix S contains the first, second, and third portions of the data s; and, forming the matrix Ŝ from the matrix S by setting the second and third portions of the data s to zero; wherein the forming of a matrix F₁ from the data s comprises: forming the matrix F₁ from the matrix S by setting the first and third portions of the data s to zero; and, wherein the forming of a matrix F₂ from the data s comprises: forming the matrix F₂ from the matrix S by setting the first and second portions of the data s to zero.
 9. The method of claim 8 wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result.
 10. The method of claim 1 wherein the performing of a conjugate gradient algorithm comprises determining a quantity q_(k) according to the following equation: q_(k)=Ŝ^(T)Ŝd_(k), wherein d_(k) is dependent upon the received signal y, the matrix Ŝ, and the matrices F₁ and F₂, and wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result.
 11. The method of claim 1 wherein the forming of a matrix Ŝ from the data s comprises: forming a matrix S from the data s, wherein the matrix S contains the first, second, and third portions of the data s; and, forming the matrix Ŝ from the matrix S by setting the second and third portions of the data s to zero; wherein the forming of a matrix F₁ from the data s comprises: forming the matrix F₁ from the matrix S by setting the first and third portions of the data s to zero; and, wherein the forming of a matrix F₂ from the data s comprises: forming the matrix F₂ from the matrix S by setting the first and second portions of the data s to zero.
 12. The method of claim 11 wherein the performing of a conjugate gradient algorithm comprises determining a quantity q_(k) according to the following equation: q _(k) =Ŝ ^(T) Ŝd _(k), wherein d_(k) is dependent upon the received signal y, the matrix Ŝ, and the matrices F₁ and F₂, and wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result.
 13. A method of processing a received signal y to produce a current channel estimate from one or more past channel estimates that include a previous channel estimate h₁ comprising: (a) decoding the received signal y to form data s; (b) forming a convolution matrix Ŝ from a first portion of the data s; (c) forming a matrix F₁ from a second portion of the data s, wherein the second portion of the data s includes data that is less recent than the data in the first portion of the data s; (d) forming a matrix F₂ from a third portion of the data s, wherein the third portion of the data s includes data that is more recent than the data in the first portion of the data s; (e) determining a predicted channel estimate h_(pred) based on the one or more past channel estimates; and, (f) performing a conjugate gradient algorithm to determine the current channel estimate, wherein the conjugate gradient algorithm is based on the received signal y, the matrix Ŝ, the matrices F₁ and F₂, the predicted channel estimate h_(pred), and the previous channel estimate h₁, wherein the conjugate gradient algorithm includes (i) forming FFTs based on the received signal y, the matrix Ŝ, and the matrices F₁ and F₂, (ii) multiplying the FFTs to form a multiplication product, and (iii) forming an inverse FFT of the multiplication product.
 14. The method of claim 13 wherein the determining of a predicted channel estimate h_(pred) comprises extrapolating the predicted channel estimate from k ones of the past channel estimates.
 15. The method of claim 14 wherein k≧2.
 16. The method of claim 13 wherein the received signal y, the matrix Ŝ, the matrices F₁ and F₂, the predicted channel estimate h_(pred), and the previous channel estimate h₁ are related according to the following equation: ŷ=y−F ₁ h ₁ −F ₂ h _(pred).
 17. The method of claim 16 wherein the conjugate gradient algorithm is performed to solve the following equation: ŷ=Ŝh wherein h comprises the current channel estimate.
 18. The method of claim 16 wherein the performing of a conjugate gradient algorithm to determine the current channel estimate comprises performing the following algorithm: (1) ŷ=y−F₁h₁−F₂h_(pred), r₁=Ŝ^(T)ŷ−Ŝ^(T)Ŝh₁ (2) For k=1 to n, iteratively calculate (a) d_(k)=r_(k)+β_(k)d_(k−1) (b) h_(k+1)=h_(k)+α_(k)d_(k) (c) r_(k+1)=r_(k)−α_(k)q_(k) where β₁=0, ${\beta_{k \geq 2} = \frac{r_{k}^{T} \cdot r_{k}}{r_{k - 1}^{T} \cdot r_{k - 1}}},$ where ${\alpha_{k} = \frac{r_{k}^{T} \cdot r_{k}}{d_{k} \cdot q_{k}}},$ and where q_(k)=Ŝ^(T)Ŝd_(k).
 19. The method of claim 18 wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result.
 20. The method of claim 18 wherein the forming of a matrix Ŝ from the data s comprises: forming a matrix S from the data s, wherein the matrix S contains the first, second, and third portions of the data s; and, forming the matrix Ŝ from the matrix S by setting the second and third portions of the data s to zero; wherein the forming of a matrix F₁ from the data s comprises: forming the matrix F₁ from the matrix S by setting the first and third portions of the data s to zero; and, wherein the forming of a matrix F₂ from the data s comprises: forming the matrix F₂ from the matrix S by setting the first and second portions of the data s to zero.
 21. The method of claim 20 wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result.
 22. The method of claim 13 wherein the performing of a conjugate gradient algorithm comprises determining a quantity q_(k) according to the following equation: q_(k)=Ŝ^(T)Ŝd_(k), wherein d_(k) is dependent upon the received signal y, the matrix Ŝ, and the matrices F₁ and F₂, and wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result.
 23. The method of claim 13 wherein the forming of a matrix Ŝ from the data s comprises: forming a matrix S from the data s, wherein the matrix S contains the first, second, and third portions of the data s; and, forming the matrix Ŝ from the matrix S by setting the second and third portions of the data s to zero; wherein the forming of a matrix F₁ from the data s comprises: forming the matrix F₁ from the matrix S by setting the first and third portions of the data s to zero; and, wherein the forming of a matrix F₂ from the data s comprises: forming the matrix F₂ from the matrix S by setting the first and second portions of the data s to zero.
 24. The method of claim 23 wherein the performing of a conjugate gradient algorithm comprises determining a quantity q_(k) according to the following equation: q _(k) =Ŝ ^(T) Ŝd _(k), wherein d_(k) is dependent upon the received signal y, the matrix Ŝ, and the matrices F₁ and F₂, and wherein q_(k) is determined by forming a first FFT of the matrix Ŝ, by forming a second FFT of the matrix Ŝ^(T), by forming a third FFT of d_(k), by multiplying the first, second, and third FFTs to produce a multiplication result, and by forming an inverse FFT of the multiplication result. 